Improving Generalization of Alignment with Human Preferences through Group Invariant Learning

Published as a preprint, 2023